Co-segmentation inspired attention module for video-based computer vision tasks


Abstract

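Video-based computer vision tasks can benefit from estimation of the salient regions and of the interactions between those regions. Traditionally, this has been done by identifying the object regions in the images, utilizing pre-trained models to perform object detection, object segmentation and/or object pose estimation. Although using pre-trained models is a viable approach, it has several limitations: the need for an exhaustive annotation of object categories, a possible domain gap between datasets, and the bias that is typically present in pre-trained models. In this work, we propose to utilize the common rationale that a sequence of video frames captures a set of common objects and the interactions between them; thus a notion of co-segmentation between the video frame features may equip the model with the ability to automatically focus on task-specific salient regions and improve the underlying task’s performance in an end-to-end manner. In this regard, we propose a generic module called “Co-Segmentation inspired Attention Module” (COSAM) that can be plugged into any CNN model to promote the notion of co-segmentation based attention among a sequence of video frame features. We show the application of COSAM in three video-based tasks, namely: (1) video-based person re-ID, (2) video captioning, and (3) video action classification, and demonstrate that COSAM is able to capture the salient regions in the video frames, leading to notable performance improvements along with interpretable attention maps for a variety of video-based vision tasks, with possible application to other video-based vision tasks as well.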
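To make the idea of co-segmentation based attention among frame features more concrete, the following is a minimal illustrative sketch and not the authors' COSAM implementation. It assumes each frame has a CNN feature map of shape (C, H, W) and scores every spatial location by its best cosine match in the other frames, so regions that recur across frames receive higher attention. The function name `cross_frame_spatial_attention`, the max-based scoring, and the min-max normalisation are all assumptions made for illustration.

```python
import numpy as np


def cross_frame_spatial_attention(feats):
    """Weight each frame's features by how well each location matches other frames.

    feats: array of shape (T, C, H, W), one feature map per frame (T >= 2).
    Returns (attended, masks), where masks has shape (T, H, W) with values in [0, 1].
    Illustrative only; this is not the COSAM module from the paper.
    """
    T, C, H, W = feats.shape
    # One L2-normalised C-dimensional descriptor per spatial location.
    desc = feats.reshape(T, C, H * W).transpose(0, 2, 1)  # (T, HW, C)
    desc = desc / (np.linalg.norm(desc, axis=-1, keepdims=True) + 1e-8)

    masks = np.zeros((T, H * W))
    for t in range(T):
        # Descriptors from every frame except frame t.
        others = np.concatenate([desc[s] for s in range(T) if s != t], axis=0)
        sim = desc[t] @ others.T          # cosine similarities, (HW, (T-1)*HW)
        masks[t] = sim.max(axis=1)        # best match found in the other frames

    # Min-max normalise each frame's scores to [0, 1].
    lo = masks.min(axis=1, keepdims=True)
    hi = masks.max(axis=1, keepdims=True)
    masks = ((masks - lo) / (hi - lo + 1e-8)).reshape(T, H, W)

    # Broadcast the spatial mask over the channel dimension.
    return feats * masks[:, None, :, :], masks
```

In this sketch the masks play the role of interpretable attention maps: a location whose descriptor also appears in other frames gets a weight near 1, while frame-specific background is suppressed. A learned module trained end-to-end, as the abstract describes, would replace the fixed similarity rule used here.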


Related resources

Object-based visual attention for computer vision

In this paper, a novel model of object-based visual attention extending Duncan’s Integrated Competition Hypothesis [Phil. Trans. R. Soc. London B 353 (1998) 1307–1317] is presented. In contrast to the attention mechanisms used in most previous machine vision systems which drive attention based on the spatial location hypothesis, the mechanisms which direct visual attention in our system are obj...


Quantifying Attention in Computer-based Tasks

Attention-to-task is one of the most important human cognitive abilities, allowing an individual to selectively focus on a specific issue (among many possible sources) and effectively carry out a task. Without this ability to focus, the individual would constantly switch between stimuli, hardly concluding any task. While attention can be influenced by many internal and external factors, the purpos...


Video segmentation based on active vision

In this paper, we present a novel approach for video shot segmentation based on active vision techniques. A continuity function is introduced that takes into account results from biological vision, and experiments and discussions are carried out.


Temporally Object-based Video Co-Segmentation

In this paper, we propose an unsupervised video object co-segmentation framework based on primary object proposals to extract the common foreground object(s) from a given video set. In addition to the objectness attributes and motion coherence, our framework exploits the temporal consistency of the object-like regions between adjacent frames to enrich the set of original object proposals. We ...


Vector Quantization Enhancement for Computer Vision Tasks

This paper augments the Bag-of-Word scheme in several respects: we incorporate a category label into the clustering process, build classifier-tailored codebooks, and weight codewords according to their probability to occur. A size-adaptive feature clustering algorithm is also proposed as an alternative to k-means. Experiments on the PASCAL VOC 2007 challenge validate the approach for classical ...



Journal

Journal title: Computer Vision and Image Understanding

Year: 2022

ISSN: 1090-235X, 1077-3142

DOI: https://doi.org/10.1016/j.cviu.2022.103532